Hypotheses and Version Spaces

Authors

  • Bernhard Ganter
  • Sergei O. Kuznetsov
Abstract

We consider the relation of a learning model described in terms of Formal Concept Analysis in [3] with a standard model of Machine Learning called version spaces. A version space consists of all possible functions compatible with training data. The overlap and distinctions of these two models are discussed. We give an algorithm for generating a version space for which the classifiers are closed under conjunction. As an application we discuss an example from predictive toxicology. The classifiers here are chemical compounds and their substructures. The example also indicates how the methodology can be applied to Conceptual Graphs.

1 Hypotheses and Implications

Our paper deals with a standard model of Machine Learning called version spaces. But first, to make the paper self-contained, we introduce basic definitions of Formal Concept Analysis (FCA) [5]. We consider a set M of “structural attributes”, a set G of objects (or observations) and a relation I ⊆ G × M such that (g, m) ∈ I if and only if object g has the attribute m. Instead of (g, m) ∈ I we also write gIm. Such a triple K := (G, M, I) is called a formal context. Using the derivation operators, defined for A ⊆ G, B ⊆ M by

A′ := {m ∈ M | gIm for all g ∈ A},
B′ := {g ∈ G | gIm for all m ∈ B},

we can define a formal concept (of the context K) to be a pair (A, B) satisfying A ⊆ G, B ⊆ M, A′ = B, and B′ = A. A is called the extent and B is called the intent of the concept (A, B). These concepts, ordered by

(A1, B1) ≥ (A2, B2) ⇐⇒ A1 ⊇ A2,

form a complete lattice, called the concept lattice of K := (G, M, I).

Next, we use the FCA setting to describe a learning model from [2]. In addition to the structural attributes of M, we consider (as in [3]) a goal attribute w ∉ M.
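The derivation operators and the definition of a formal concept given above can be illustrated with a minimal Python sketch. The toy context (objects g1 to g3, attributes a to c) and the function names `up`, `down`, `is_concept`, and `concepts` are assumptions made for illustration and do not come from the paper. The enumeration is deliberately naive (it closes every subset of M) and is only meant for very small contexts.

```python
from itertools import combinations

# Hypothetical toy context K = (G, M, I); not taken from the paper.
G = {"g1", "g2", "g3"}
M = {"a", "b", "c"}
I = {
    ("g1", "a"), ("g1", "b"),
    ("g2", "b"), ("g2", "c"),
    ("g3", "a"), ("g3", "b"), ("g3", "c"),
}

def up(A):
    """A′: the attributes shared by all objects in A."""
    return frozenset(m for m in M if all((g, m) in I for g in A))

def down(B):
    """B′: the objects that have all attributes in B."""
    return frozenset(g for g in G if all((g, m) in I for m in B))

def is_concept(A, B):
    """(A, B) is a formal concept iff A′ = B and B′ = A."""
    return up(A) == frozenset(B) and down(B) == frozenset(A)

def concepts():
    """Naive enumeration: every concept has the form (B′, B′′) for some B ⊆ M."""
    found = set()
    for r in range(len(M) + 1):
        for B in combinations(sorted(M), r):
            extent = down(B)
            found.add((extent, up(extent)))
    return found

if __name__ == "__main__":
    # Print concepts from the largest extent to the smallest.
    for extent, intent in sorted(concepts(), key=lambda c: -len(c[0])):
        print(sorted(extent), sorted(intent))
```

Ordering the resulting pairs by inclusion of their extents gives the concept lattice of this small context.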
The goal attribute w divides the set G of all objects into three subsets: the set G+ of those objects that are known to have the property w (these are the positive examples), the set G− of those objects of which it is known that they do not have w (the negative examples), and the set Gτ of undetermined examples, i.e., of those objects of which it is unknown whether they have property w or not. This gives three subcontexts of K = (G, M, I): K+ := (G+, M, I+), K− := (G−, M, I−), and Kτ := (Gτ, M, Iτ), where for ε ∈ {+, −, τ} we have Iε := I ∩ (Gε × M).

Intents, as defined above, are attribute sets shared by some of the observed objects. In order to form hypotheses about structural causes of the goal attribute w, we are interested in sets of structural attributes that are common to some positive, but to no negative examples. Thus, a positive hypothesis for w is an intent of K+ that is not contained in the intent g− := {m ∈ M | (g, m) ∈ I−} of any negative example g ∈ G−. Negative hypotheses are defined accordingly.

These hypotheses can be used as predictors for the undetermined examples. If the intent g′ := {m ∈ M | (g, m) ∈ I} of an object g ∈ Gτ contains a positive, but no negative hypothesis, then g is classified positively. Negative classifications are defined similarly. If g′ contains hypotheses of both kinds, or if g′ contains no hypothesis at all, then the classification is contradictory or undetermined, respectively. In [11] we argued that the consideration can be restricted to minimal (w.r.t. inclusion ⊆) hypotheses, positive as well as negative, since an object intent obviously contains a positive hypothesis if and only if it contains a minimal positive hypothesis, etc.

A. de Moor, W. Lex, and B. Ganter (Eds.): ICCS 2003, LNAI 2746, pp. 83–95, 2003. © Springer-Verlag Berlin Heidelberg 2003
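The definitions of hypotheses and of the four classification outcomes can be made concrete with a small Python sketch. Everything named here is hypothetical: the example data, the attribute names, and the functions `intents`, `hypotheses`, `minimal`, and `classify`. The sketch also assumes that only intents supported by at least one example are considered, which is one reading of "common to some positive ... examples". It enumerates intents by brute force, so it is suitable only for tiny contexts and is not the algorithm of the paper.

```python
from itertools import combinations

# Hypothetical toy data: each example is mapped to its object intent g′.
positive = {"p1": {"a", "b", "c"}, "p2": {"a", "b", "d"}}
negative = {"n1": {"a", "c", "d"}, "n2": {"b", "e"}}
undetermined = {
    "u1": {"a", "b", "c"},
    "u2": {"b", "c", "e"},
    "u3": {"a", "b", "e"},
    "u4": {"c", "d"},
}

def intents(examples):
    """Intents with non-empty extent: intersections of non-empty sets of object intents."""
    rows = [frozenset(v) for v in examples.values()]
    result = set()
    for r in range(1, len(rows) + 1):
        for combo in combinations(rows, r):
            result.add(frozenset.intersection(*combo))
    return result

def hypotheses(own, other):
    """Intents of `own` that are not contained in the intent of any example in `other`."""
    return {
        h for h in intents(own)
        if not any(h <= frozenset(o) for o in other.values())
    }

def minimal(hyps):
    """Keep only hypotheses that are minimal with respect to set inclusion."""
    return {h for h in hyps if not any(other < h for other in hyps)}

def classify(intent, pos_h, neg_h):
    """Classify an object intent according to the hypotheses it contains."""
    has_pos = any(h <= intent for h in pos_h)
    has_neg = any(h <= intent for h in neg_h)
    if has_pos and has_neg:
        return "contradictory"
    if has_pos:
        return "positive"
    if has_neg:
        return "negative"
    return "undetermined"

pos_h = minimal(hypotheses(positive, negative))
neg_h = minimal(hypotheses(negative, positive))

if __name__ == "__main__":
    for name, intent in undetermined.items():
        print(name, classify(intent, pos_h, neg_h))
```

Restricting to minimal hypotheses does not change any classification, since an object intent contains some hypothesis exactly when it contains a minimal one.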

Similar resources

Version Space Support Vector Machines: An Extended Paper

We argue to use version spaces as an approach to reliable classification. The key idea is to construct version spaces containing the hypotheses of the target concept or of its close approximations. As a result the unanimous-voting classification rule of version spaces does not misclassify; i.e., instance classifications become reliable. We propose to implement version spaces using support vecto...

Agnostic Active Learning Without Constraints

We present and analyze an agnostic active learning algorithm that works without keeping a version space. This is unlike all previous approaches where a restricted set of candidate hypotheses is maintained throughout learning, and only hypotheses from this set are ever returned. By avoiding this version space approach, our algorithm sheds the computational burden and brittleness associated with ...

Notes on Relation Between Symbolic Classifiers

Symbolic classifiers allow for solving classification tasks and provide the reason for the classifier's decision. Such classifiers have been studied by a large number of researchers and are known under a number of names including tests, JSM-hypotheses, version spaces, emerging patterns, proper predictors of a target class, representative sets etc. Here we consider such classifiers with restriction on count...

ON COMPACTNESS AND G-COMPLETENESS IN FUZZY METRIC SPACES

In [Fuzzy Sets and Systems 27 (1988) 385-389], M. Grabiec introduced a notion of completeness for fuzzy metric spaces (in the sense of Kramosil and Michalek) that was successfully used to obtain a fuzzy version of Banach's contraction principle. According to the classical case, one can expect that a compact fuzzy metric space be complete in Grabiec's sense. We show here that this is not the case,...

Fixed points for total asymptotically nonexpansive mappings in a new version of bead space

The notion of a bead metric space is defined as a nice generalization of the uniformly convex normed space such as $CAT(0)$ space, where the curvature is bounded from above by zero. In fact, the bead spaces themselves can be considered in particular as natural extensions of convex sets in uniformly convex spaces and normed bead spaces are identical with uniformly convex spaces. In this paper, w...

Publication date: 2003